We propose a class of estimators for the parameters of a GARCH(p, q) sequence. We show that our estimators are consistent and asymptotically normal under mild conditions. The quasi-maximum likelihood and the likelihood estimators are discussed in detail. We show that the maximum likelihood estimator is optimal. If the tail of the distribution of the innovations is polynomial, even a quasi-maximum likelihood estimator based on exponential density performs better than the standard normal density-based quasi-likelihood estimator of Lee and Hansen and Lumsdaine.
1. Introduction. The generalized autoregressive conditional heteroscedastic (GARCH) process was introduced by Bollerslev (1986). The GARCH process has received considerable attention from applied as well as from theoretical points of view. We say that $\{y_k, -\infty < k < \infty\}$ is a GARCH(p, q) process if it satisfies the equations

(1.1)  $y_k = \sigma_k \varepsilon_k$

and

(1.2)  $\sigma_k^2 = \omega + \sum_{1 \le i \le p} \alpha_i y_{k-i}^2 + \sum_{1 \le j \le q} \beta_j \sigma_{k-j}^2,$

where

(1.3)  $\omega > 0, \quad \alpha_i \ge 0, \ 1 \le i \le p, \quad \beta_j \ge 0, \ 1 \le j \le q$
I. BERKES AND L. HORVATH

are constants. We also assume that

(1.4)  $\{\varepsilon_i, -\infty < i < \infty\}$ are independent, identically distributed random variables.

Throughout this paper we assume that (1.1)-(1.4) hold.
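The recursion (1.1)-(1.2) can be simulated directly. The following is a minimal sketch, not part of the original paper: the function name `simulate_garch`, the standard normal innovations, the start-up values and the burn-in length are all illustrative assumptions.

```python
import math
import random


def simulate_garch(omega, alpha, beta, n, burn_in=500, seed=0):
    """Simulate y_1..y_n and sigma_1^2..sigma_n^2 from (1.1)-(1.2).

    Assumptions (not from the text): standard normal innovations, start-up
    values equal to omega, and a burn-in period to forget the start-up.
    """
    rng = random.Random(seed)
    p, q = len(alpha), len(beta)
    past_y2 = [omega] * p        # y_{k-1}^2, ..., y_{k-p}^2, most recent first
    past_sigma2 = [omega] * q    # sigma_{k-1}^2, ..., sigma_{k-q}^2
    ys, sigma2s = [], []
    for k in range(n + burn_in):
        # (1.2): conditional scale from past squared observations and scales
        sigma2 = (omega
                  + sum(a * v for a, v in zip(alpha, past_y2))
                  + sum(b * v for b, v in zip(beta, past_sigma2)))
        y = math.sqrt(sigma2) * rng.gauss(0.0, 1.0)   # (1.1)
        past_y2 = ([y * y] + past_y2)[:p]
        past_sigma2 = ([sigma2] + past_sigma2)[:q]
        if k >= burn_in:
            ys.append(y)
            sigma2s.append(sigma2)
    return ys, sigma2s
```

For example, `simulate_garch(0.1, [0.1], [0.8], 1000)` returns a GARCH(1, 1) path of length 1000 together with the corresponding values of $\sigma_k^2$.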

The GARCH(1, 1) model was studied by Nelson (1991) who showed that (1.1) and (1.2) have a unique stationary solution if and only if $E\log(\beta_1 + \alpha_1\varepsilon_0^2) < 0$. The general case was investigated by Bougerol and Picard (1992a, b). Let

$\tau_n = (\beta_1 + \alpha_1\varepsilon_n^2, \beta_2, \ldots, \beta_{q-1}) \in \mathbb{R}^{q-1},$

$\xi_n = (\varepsilon_n^2, 0, \ldots, 0) \in \mathbb{R}^{q-1}$

and

$\alpha = (\alpha_2, \ldots, \alpha_{p-1}) \in \mathbb{R}^{p-2}.$

[Clearly, without loss of generality, we may and shall assume $\min(p, q) \ge 2$.] Define the $(p + q - 1) \times (p + q - 1)$ matrix $A_n$, written in block form, by

$A_n = \begin{bmatrix} \tau_n & \beta_q & \alpha & \alpha_p \\ I_{q-1} & 0 & 0 & 0 \\ \xi_n & 0 & 0 & 0 \\ 0 & 0 & I_{p-2} & 0 \end{bmatrix},$

where $I_{q-1}$ and $I_{p-2}$ are the identity matrices of size $q - 1$ and $p - 2$, respectively. The norm of any $d \times d$ matrix $M$ is defined by

$\|M\| = \sup\{\|Mx\|_d / \|x\|_d : x \in \mathbb{R}^d, x \ne 0\},$

where $\|\cdot\|_d$ is the usual (Euclidean) norm in $\mathbb{R}^d$. The top Liapounov exponent $\gamma_L$ associated with the sequence $\{A_n, -\infty < n < \infty\}$ is

$\gamma_L = \inf_{0 \le n < \infty} \frac{1}{n+1} E\log\|A_0 A_1 \cdots A_n\|,$

assuming that

(1.5)  $E(\log\|A_0\|) < \infty.$

Bougerol and Picard (1992a, b) showed that if (1.5) holds, then (1.1) and (1.2) have a unique stationary solution if and only if

(1.6)  $\gamma_L < 0.$
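For GARCH(1, 1) the stationarity criterion quoted above, $E\log(\beta_1 + \alpha_1\varepsilon_0^2) < 0$, is easy to check by simulation. The sketch below is only an illustration: the name `nelson_exponent`, the standard normal innovations and the number of draws are assumptions, and a Monte Carlo average is of course not a proof.

```python
import math
import random


def nelson_exponent(alpha1, beta1, n_draws=100_000, seed=0):
    """Monte Carlo estimate of E log(beta1 + alpha1 * eps^2).

    Assumption (not from the text): eps is standard normal.
    A negative value indicates a stationary GARCH(1, 1) solution.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        e = rng.gauss(0.0, 1.0)
        total += math.log(beta1 + alpha1 * e * e)
    return total / n_draws
```

By Jensen's inequality the exponent is negative whenever $\alpha_1 E\varepsilon_0^2 + \beta_1 \le 1$ and $\varepsilon_0^2$ is nondegenerate, so `nelson_exponent(0.2, 0.8)` should come out negative even though $\alpha_1 + \beta_1 = 1$.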

The estimation of the parameter $\theta = (\omega, \alpha_1, \ldots, \alpha_p, \beta_1, \ldots, \beta_q)$ has been studied by several authors. Lee and Hansen (1994) and Lumsdaine (1996) used the quasi-maximum likelihood method to estimate the parameters from the sample $y_1, \ldots, y_n$ in GARCH(1, 1) models. The idea behind the quasi-maximum likelihood method is the following. The likelihood function is derived under the assumption that $\varepsilon_0$ is standard normal. The estimator is the point where the likelihood function reaches its largest value. The estimator in Lee and Hansen (1994) and Lumsdaine (1996) is "local" since the likelihood function is maximized in a small neighborhood of $\theta$. They show that the quasi-maximum likelihood estimator is consistent and asymptotically normal without assuming the normality of $\varepsilon_0$. However, very strict conditions are assumed on the distribution of $\varepsilon_0$ and the value of $\theta$. Berkes, Horvath and Kokoszka (2003) investigated the asymptotic properties of the quasi-maximum likelihood estimator for $\theta$ in GARCH(p, q) models and obtained their asymptotic results under weak conditions. Berkes and Horvath (2003) showed that the quasi-maximum likelihood estimator cannot be $n^{1/2}$-consistent if $E|\varepsilon_0|^\kappa = \infty$ for some $0 < \kappa < 4$. This shows the limitations of the quasi-maximum likelihood estimation method. The existence of the GARCH(p, q) sequence requires only that $E|\log\varepsilon_0^2| < \infty$ but the estimation works only if $E|\varepsilon_0|^\kappa < \infty$ with some $\kappa > 4$. The quasi-maximum likelihood estimator does not use the distribution of $\varepsilon_0$ and therefore, as we shall see, it is not efficient. If $E\varepsilon_0 = 0$ and $E\varepsilon_0^2 = 1$, then $\sigma_k^2$ is the conditional variance of $y_k$ given the past. However, without any moment conditions, $\sigma_k^2$ is the conditional scaling parameter of $y_k$.

Since $\sigma_k^2$ is defined by a recursion, we use a recursion to define our estimator. Let $u = (x, s, t) \in \mathbb{R}^{p+q+1}$, $x \in \mathbb{R}$, $s \in \mathbb{R}^p$ and $t \in \mathbb{R}^q$. We start with the initial conditions: if $q \ge p$, then

$c_0(u) = x/(1 - (t_1 + \cdots + t_q)),$

$c_1(u) = s_1,$

$c_2(u) = s_2 + t_1 c_1(u),$

$\vdots$

$c_p(u) = s_p + t_1 c_{p-1}(u) + \cdots + t_{p-1} c_1(u),$

$c_{p+1}(u) = t_1 c_p(u) + \cdots + t_p c_1(u),$

$\vdots$

$c_q(u) = t_1 c_{q-1}(u) + \cdots + t_{q-1} c_1(u),$

and if $q < p$, the equations above are replaced with

$c_0(u) = x/(1 - (t_1 + \cdots + t_q)),$

$c_1(u) = s_1,$

$c_2(u) = s_2 + t_1 c_1(u),$

$\vdots$

$c_{q+1}(u) = s_{q+1} + t_1 c_q(u) + \cdots + t_q c_1(u),$

$\vdots$

$c_p(u) = s_p + t_1 c_{p-1}(u) + \cdots + t_q c_{p-q}(u).$

In general, if $i > R = \max(p, q)$, then

(1.7)  $c_i(u) = t_1 c_{i-1}(u) + t_2 c_{i-2}(u) + \cdots + t_q c_{i-q}(u).$
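Both sets of initial conditions and the general step (1.7) can be read as one rule: $c_i(u)$ equals $s_i$ (present only for $i \le p$) plus $t_j c_{i-j}(u)$ summed over $1 \le j \le \min(q, i-1)$. The sketch below implements that reading; the function name `garch_coefficients` and the list-based interface are illustrative assumptions.

```python
def garch_coefficients(x, s, t, n_terms):
    """Return [c_0(u), ..., c_{n_terms}(u)] for u = (x, s, t).

    s holds (s_1, ..., s_p) and t holds (t_1, ..., t_q); the rule below
    reproduces the initial conditions and the recursion (1.7).
    """
    p, q = len(s), len(t)
    c = [x / (1.0 - sum(t))]                   # c_0(u)
    for i in range(1, n_terms + 1):
        value = s[i - 1] if i <= p else 0.0    # s_i enters only for i <= p
        # add t_j * c_{i-j}(u) for j = 1..q, never reaching back to c_0
        for j in range(1, min(q, i - 1) + 1):
            value += t[j - 1] * c[i - j]
        c.append(value)
    return c
```

For GARCH(1, 1) this reduces to $c_0(u) = x/(1 - t_1)$ and $c_i(u) = s_1 t_1^{i-1}$ for $i \ge 1$.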

We choose an arbitrary positive function $h$ and define

$\hat L_n(u) = \frac{1}{n}\sum_{1 \le k \le n} \log\left\{\frac{1}{\hat w_k^{1/2}(u)}\, h\left(y_k/\hat w_k^{1/2}(u)\right)\right\},$

where

$\hat w_k(u) = c_0(u) + \sum_{1 \le i < k} c_i(u)\, y_{k-i}^2.$

Let $0 < \underline{u} < \overline{u}$, $0 < \rho_0 < 1$, $q\underline{u} < \rho_0$ and define

$U = \{u : t_1 + t_2 + \cdots + t_q \le \rho_0 \text{ and } \underline{u} \le \min(x, s_1, s_2, \ldots, s_p, t_1, t_2, \ldots, t_q) \le \max(x, s_1, s_2, \ldots, s_p, t_1, t_2, \ldots, t_q) \le \overline{u}\}.$

From now on we replace (1.3) with the somewhat stronger condition

(1.8)  $\theta$ is in the interior of $U$.

We use $|\cdot|$ to denote the maximum norm of vectors and matrices. Let $x \vee y = \max(x, y)$. In this paper we study the asymptotic properties of

$\hat\theta_n = \arg\max_{u \in U} \hat L_n(u).$

We note that $\hat L_n(u)$ is a continuously differentiable function, so standard numerical methods can be used to compute $\hat\theta_n$.
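The objective $\hat L_n(u)$ can be evaluated from the coefficients $c_i(u)$ and the sample. The following is a minimal sketch under stated assumptions: the names `w_hat`, `L_hat`, `log_gauss` and `log_laplace` are illustrative, the coefficient list `c` is supplied by the caller, and the quadratic-time loop is chosen for clarity rather than speed.

```python
import math


def w_hat(c, y, k):
    """hat w_k(u) = c_0 + sum_{1 <= i < k} c_i * y_{k-i}^2, where y[0] is y_1."""
    return c[0] + sum(c[i] * y[k - 1 - i] ** 2 for i in range(1, k))


def log_gauss(z):
    """log h(z) for the standard normal density."""
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * z * z


def log_laplace(z):
    """log h(z) for the two-sided exponential density."""
    return -math.log(2.0) - abs(z)


def L_hat(c, y, log_h):
    """Average of log{ w^{-1/2} h(y_k / w^{1/2}) } with w = hat w_k(u)."""
    n = len(y)
    total = 0.0
    for k in range(1, n + 1):
        w = w_hat(c, y, k)
        total += -0.5 * math.log(w) + log_h(y[k - 1] / math.sqrt(w))
    return total / n
```

For a GARCH(1, 1) candidate $u = (x, s_1, t_1)$ one may pass `c = [x / (1 - t1)] + [s1 * t1 ** (i - 1) for i in range(1, len(y))]`; maximizing `L_hat` over $U$ with any numerical optimizer then gives a value playing the role of $\hat\theta_n$.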

In our first result we give a sufficient criterion for $|\hat\theta_n - \theta| \to 0$ a.s. To state this result we will need some additional regularity conditions:

(1.9)  the polynomials $\alpha_1 x + \alpha_2 x^2 + \cdots + \alpha_p x^p$ and $1 - \beta_1 x - \beta_2 x^2 - \cdots - \beta_q x^q$ are coprimes in the set of polynomials with real coefficients,

(1.10)  $\varepsilon_0^2$ is a nondegenerate random variable

and

(1.11)  $\lim_{t \to 0} t^{-\mu} P\{\varepsilon_0^2 \le t\} = 0$, with some $\mu > 0$.

Condition (1.8) is somewhat stronger than (1.3) but $\beta_1 + \cdots + \beta_q < 1$ is a necessary condition for the existence of a GARCH(p, q) sequence [cf. Berkes, Horvath and Kokoszka (2003)]. Assumptions (1.9) and (1.10) are needed to uniquely identify the parameter $\theta$. So far all our conditions are related to the structure of the GARCH(p, q) process. The following set of conditions concerns the moments of $\varepsilon_0$ and the smoothness of $h$:

(1.12)  $E|\varepsilon_0^2|^\kappa < \infty$ with some $\kappa > 0$,

and there is $0 < C_0 < \infty$ such that

(1.13)  $E|\log h(\varepsilon_0 t)| \le C_0(t^{\nu_0} + 1)$ for all $t > 0$, with some $0 < \nu_0 < 2\kappa$.
Let

$g(y, t) = \log\{t h(yt)\}, \quad -\infty < y < \infty, \ t > 0,$

and

$g_1(y, t) = \frac{\partial}{\partial t} g(y, t), \quad -\infty < y < \infty, \ t > 0.$

We also assume that there is a function $C_1(y)$ such that

(1.14)  $|g_1(y, t)| \le C_1(y)(t^{\nu_1} + 1)/t$ for all $0 < t < \infty$ and $y \in \mathbb{R}$, with some $0 < \nu_1 < 2\kappa$,

and

(1.15)  $E C_1(\varepsilon_0) < \infty.$

If $h$ is a density, then condition (1.14) means that the density function $t h(yt)$ is smooth in the parameter $t$.
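We will show in Lemma 4.1 that

$L(u) = E\log\left\{\frac{1}{w_0^{1/2}(u)}\, h\left(y_0/w_0^{1/2}(u)\right)\right\}$

exists for all $u \in U$, where

$w_k(u) = c_0(u) + \sum_{1 \le i < \infty} c_i(u)\, y_{k-i}^2.$

We note that

$\sigma_k^2 = w_k(\theta).$

The following condition will imply [see (4.6)] that $L(u)$ has a unique maximum in $U$ at $\theta$:

(1.16)  $E g(\varepsilon_0, t) < E g(\varepsilon_0, 1)$ for all $0 < t < \infty$, $t \ne 1$.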
Theorem 1.1. If (1.5), (1.6) and (1.8)-(1.16) hold, then

$\hat\theta_n \to \theta$ a.s.

The proof of Theorem 1.1 will be given in Section 4.

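Next we discuss the asymptotic normality of $n^{1/2}(\hat\theta_n - \theta)$. We need further smoothness conditions on $t h(yt)$. Let $g_2(y, t)$ and $g_3(y, t)$ be the second and third derivatives of $g(y, t)$ with respect to $t$. We assume that there are functions $C_2$ and $C_3$ such that

(1.17)  $|g_2(y, t)| \le C_2(y)(t^{\nu_2} + 1)/t^2$ for all $0 < t < \infty$ and $y \in \mathbb{R}$, with some $0 < \nu_2 < \infty$,

(1.18)  $E C_2(\varepsilon_0) < \infty,$

(1.19)  $|g_3(y, t)| \le C_3(y)(t^{\nu_3} + 1)/t^3$ for all $0 < t < \infty$ and $y \in \mathbb{R}$, with some $0 < \nu_3 < \infty$,

and

(1.20)  $E C_3(\varepsilon_0) < \infty.$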

We use $w_k'(u)$ to denote the row vector of the derivatives of $w_k(u)$ and $w_k''(u)$ the matrix of the second-order partial derivatives of $w_k(u)$ (the Hessian matrix). Berkes, Horvath and Kokoszka (2003) showed that

$A = E\{(w_0'(\theta)/w_0(\theta))^T (w_0'(\theta)/w_0(\theta))\}$

exists and is nonsingular ($T$ denotes the transpose). We also assume that

(1.21)  $0 < E g_1^2(\varepsilon_0, 1) < \infty,$

(1.22)  $E|g_2(\varepsilon_0, 1)| < \infty$ and $E g_2(\varepsilon_0, 1) \ne 0.$

If (1.21) and (1.22) hold, then

$0 < \tau^2 = \frac{E g_1^2(\varepsilon_0, 1)}{\{E g_2(\varepsilon_0, 1)\}^2} < \infty.$

The multivariate normal distribution with mean 0 and covariance matrix $D$ will be denoted by $N(0, D)$.

Theorem 1.2. If (1.5), (1.6) and (1.8)-(1.22) hold, then

$n^{1/2}(\hat\theta_n - \theta) \xrightarrow{D} N(0, 4\tau^2 A^{-1}).$

This result will be proven in Section 4.
Remark 1.1. Let $f(y)$ denote the density function of $\varepsilon_0$ and $I_f(t)$ the Fisher information number of the scale family $t f(xt)$, $t > 0$. If $\varepsilon_1, \ldots, \varepsilon_n$ is known, then $\hat t_n = \arg\max\{\prod_{1 \le i \le n} t h(\varepsilon_i t) : t > 0\}$ can be used to estimate the scale parameter. One can verify that under suitable regularity conditions $n^{1/2}(\hat t_n - 1)$ will be asymptotically normal with mean 0 and variance $\tau^2$. So by Lehmann [(1991), page 406] we conclude that

(1.23)  $\tau^2 \ge 1/I_f(1),$

and we have the equality in (1.23) when $h = f$.
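For the two choices of $h$ treated in Examples 2.1 and 2.2 the scale estimator of Remark 1.1 has a closed form: setting the derivative of $\sum_i \log\{t h(\varepsilon_i t)\}$ to zero gives $\hat t_n = (n/\sum_i \varepsilon_i^2)^{1/2}$ for the normal $h$ and $\hat t_n = n/\sum_i |\varepsilon_i|$ for the two-sided exponential $h$. The sketch below, with illustrative function names, shows that the limit of $\hat t_n$ depends on $h$ unless the innovations are scaled accordingly.

```python
import math
import random


def scale_mle_gauss(eps):
    """argmax over t > 0 of prod t*h(eps_i*t) for the standard normal h."""
    return math.sqrt(len(eps) / sum(e * e for e in eps))


def scale_mle_laplace(eps):
    """argmax over t > 0 of prod t*h(eps_i*t) for h(t) = exp(-|t|)/2."""
    return len(eps) / sum(abs(e) for e in eps)


# Illustration with standard normal draws (an assumption): the first estimate
# is close to 1, the second is close to 1/E|eps_0| = sqrt(pi/2).
rng = random.Random(0)
draws = [rng.gauss(0.0, 1.0) for _ in range(20_000)]
t_gauss = scale_mle_gauss(draws)
t_laplace = scale_mle_laplace(draws)
```

The second estimate converging to a value other than 1 is the reason a scaling condition such as (1.16) is needed when $h$ differs from the true density.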

Remark 1.2. Newey and Steigerwald (1997) consider more general models which include the GARCH(p, q) sequence. They point out that identification of the parameters in the drift term might be difficult. In our paper we study the estimation of the parameters in the error process of the Newey-Steigerwald model. This is the part which makes GARCH different from other time series. Our results cannot be applied directly to other versions of GARCH but our method can be used to investigate the properties of estimators in LGARCH [Bollerslev (1986)], NGARCH [Engle and Ng (1993)], MGARCH [Geweke (1986)], EGARCH [Nelson (1991)] and VGARCH [Engle and Ng (1993)].

Remark 1.3. Lee and Hansen (1994) assume that the observed sequence $y_t$ is a stationary and ergodic martingale. They also assume that

$E y_0^2 < \infty.$

We do not impose this moment condition. Under our conditions we have only that

$E|y_0|^\delta < \infty$, with some $\delta > 0$.

It would be interesting and practically useful to extend the results of Lee and Hansen (1994) to the present situation.

Remark 1.4. Drost and Klaassen (1997) showed that there is a reparametrization of GARCH(1, 1) such that the efficient score functions in the parametric model of the autoregression parameters are orthogonal to the tangent space generated by the nuisance parameter, thus suggesting that adaptive estimation of the parameters is possible. Drost and Klaassen (1997) construct adaptive and hence efficient estimators in the reparametrized GARCH(1, 1) in a mean-type context.

Next we consider three special choices of h. 
2. Examples.

Example 2.1. Let $h(t) = (2\pi)^{-1/2}\exp(-t^2/2)$ (the standard normal density function). Using this function in the definition of $\hat L_n$, we get the quasi-maximum likelihood estimator investigated by Lee and Hansen (1994) and Lumsdaine (1996). Elementary calculations give that $|\log h(yt)| \le C_0(y^2t^2 + 1)$,

$g_1(y, t) = (1 - y^2t^2)/t, \quad |g_1(y, t)| \le (1 + y^2)(1 + t^2)/t,$

$g_2(y, t) = -(1 + y^2t^2)/t^2, \quad |g_2(y, t)| \le (1 + y^2)(1 + t^2)/t^2$

and $g_3(y, t) = 2/t^3$. It is easy to see that $t = (E\varepsilon_0^2)^{-1/2}$ is the unique solution of the equation $E g_1(\varepsilon_0, t) = 0$ and $E g(\varepsilon_0, t)$ has a unique maximum at $(E\varepsilon_0^2)^{-1/2}$. If we assume that

(2.1)  $E\varepsilon_0^2 = 1,$

then condition (1.16) is satisfied. We note that (2.1) is a standard condition assumed by Lee and Hansen (1994) and Lumsdaine (1996). Clearly, $g_1(\varepsilon_0, 1) = 1 - \varepsilon_0^2$ and $g_2(\varepsilon_0, 1) = -1 - \varepsilon_0^2$. Hence (1.21) holds if and only if $E\varepsilon_0^4 < \infty$. Also, $E g_2(\varepsilon_0, 1) = -2$ by (2.1) and $\tau^2 = E(1 - \varepsilon_0^2)^2/4 = (E\varepsilon_0^4 - 1)/4$. Hence the quasi-maximum likelihood estimator is almost surely consistent if $E|\varepsilon_0^2|^\kappa < \infty$ with some $\kappa > 1$ and asymptotically normal if $E\varepsilon_0^4 < \infty$.

Example 2.2. Let $h(t) = (1/2)\exp(-|t|)$ (two-sided exponential distribution). Elementary calculations show that $|\log h(yt)| \le 1 + |y|t$, $E|\log h(\varepsilon_0 t)| \le 1 + tE|\varepsilon_0|$, $g(y, t) = \log t - \log 2 - |y|t$,

$g_1(y, t) = (1 - |y|t)/t, \quad |g_1(y, t)| \le (1 + |y|)(1 + t)/t,$

$g_2(y, t) = -1/t^2$ and $g_3(y, t) = 2/t^3$. Hence the unique solution of the equation $E g_1(\varepsilon_0, t) = 0$ is $t = 1/E|\varepsilon_0|$, which will be 1 if and only if $E|\varepsilon_0| = 1$. Assuming that

(2.2)  $E|\varepsilon_0| = 1,$

we get that (1.16) holds. Clearly, $g_1(\varepsilon_0, 1) = 1 - |\varepsilon_0|$ and $g_2(\varepsilon_0, 1) = -1$. Hence (1.22) is always satisfied and (1.21) holds if and only if $E\varepsilon_0^2 < \infty$. Also, $\tau^2 = E(1 - |\varepsilon_0|)^2 = E\varepsilon_0^2 - 1$. Hence the exponential density based estimator is almost surely consistent if $E|\varepsilon_0^2|^\kappa < \infty$ with some $\kappa > 1/2$ and asymptotically normal if $E\varepsilon_0^2 < \infty$.

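Example 2.3. Let $h(t) = \{(\vartheta - 1)/2\}(1 + |t|)^{-\vartheta}$ with some $\vartheta > 1$. We note that $E|\log h(\varepsilon_0 t)| \le C_0(E\log(|\varepsilon_0| + 1) + \log(1 + t) + 1)$,

$g(y, t) = \log t + \log((\vartheta - 1)/2) - \vartheta\log(1 + |y|t),$

$g_1(y, t) = 1/t - \vartheta|y|/(1 + |y|t), \quad |g_1(y, t)| \le C_1/t,$

$g_2(y, t) = -\frac{1}{t^2} + \vartheta\left(\frac{|y|}{1 + |y|t}\right)^2, \quad |g_2(y, t)| \le C_2/t^2$

and $|g_3(y, t)| \le C_3/t^3$. The unique solution of the equation $E g_1(\varepsilon_0, t) = 0$ is $t = 1$ if and only if

(2.3)  $E\left(\frac{|\varepsilon_0|}{1 + |\varepsilon_0|}\right) = \frac{1}{\vartheta},$

and since $g_2(y, t) < 0$, $E g(\varepsilon_0, t)$ has a unique maximum at $t = 1$; that is, (1.16) holds. By (1.10) we have (1.21) and

$E g_2(\varepsilon_0, 1) = -1 + \vartheta E\left(\frac{|\varepsilon_0|}{1 + |\varepsilon_0|}\right)^2 < -1 + \vartheta E\left(\frac{|\varepsilon_0|}{1 + |\varepsilon_0|}\right) = 0,$

showing that (1.22) holds. Thus we can estimate $\theta$ using this $h$ as long as $E|\varepsilon_0^2|^\kappa < \infty$ with some $\kappa > 0$.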
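The closed forms of $g$, $g_1$ and $g_2$ in Example 2.3 can be cross-checked numerically. The sketch below compares them with central finite differences; the function names, the step size and the test point are illustrative assumptions.

```python
import math


def g(y, t, theta):
    """g(y, t) = log{t h(yt)} for h(t) = ((theta - 1)/2) (1 + |t|)^(-theta)."""
    return math.log(t) + math.log((theta - 1.0) / 2.0) - theta * math.log(1.0 + abs(y) * t)


def g1(y, t, theta):
    """First derivative of g with respect to t."""
    return 1.0 / t - theta * abs(y) / (1.0 + abs(y) * t)


def g2(y, t, theta):
    """Second derivative of g with respect to t."""
    return -1.0 / t ** 2 + theta * (abs(y) / (1.0 + abs(y) * t)) ** 2


def central_diff(f, t, step=1e-5):
    """Symmetric finite-difference approximation of f'(t)."""
    return (f(t + step) - f(t - step)) / (2.0 * step)
```

At a point such as $y = 1.5$, $t = 0.7$, $\vartheta = 4$, `central_diff(lambda s: g(1.5, s, 4.0), 0.7)` should agree with `g1(1.5, 0.7, 4.0)` up to the finite-difference error, and likewise for `g1` against `g2`.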

Example 2.4. Let $h(t) = f(t)$, where $f$ is the density function of $\varepsilon_0$. Since $-\log$ is strictly convex, Jensen's inequality shows that

(2.4)  $E\log\{t f(\varepsilon_0 t)/f(\varepsilon_0)\} < \log E\{t f(\varepsilon_0 t)/f(\varepsilon_0)\} = 0$

if

(2.5)  $t f(\varepsilon_0 t)/f(\varepsilon_0)$ is nonconstant.

If, following Lehmann [(1991), page 409], we assume that the distributions determined by the scale family of densities $t f(yt)$, $t > 0$, are distinct, then (2.5) holds, with the exception of $t = 1$, and therefore (1.16) holds. Also,

$g_1(\varepsilon_0, 1) = 1 + \varepsilon_0 f'(\varepsilon_0)/f(\varepsilon_0),$

and $\tau^2 = 1/I_f(1)$, where $I_f(t)$ is the Fisher information number of the scale family $t f(yt)$, $t > 0$. In this case (1.13)-(1.19) are analogous to the conditions used by Lehmann (1991), Section 6.2, to establish the asymptotic normality of the maximum likelihood estimator of the scale parameter of the family $t f(yt)$ based on independent, identically distributed observations.

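Condition (1.16) connects $h$ and the distribution of the innovations. We have seen in Example 2.4 that (1.16) is always satisfied if the maximum likelihood method is used. However, using another $h$, we may have to scale the model [cf. (2.1)-(2.3)]. Next we study the effect of scaling on the estimators and their asymptotic distributions. Let us assume that our model is

(2.6)  $y_k = \tilde\sigma_k\tilde\varepsilon_k,$

(2.7)  $\tilde\sigma_k^2 = \tilde\omega + \sum_{1 \le i \le p}\tilde\alpha_i y_{k-i}^2 + \sum_{1 \le j \le q}\beta_j\tilde\sigma_{k-j}^2.$

The parameter of (2.6) and (2.7) is $\tilde\theta = (\tilde\omega, \tilde\alpha_1, \ldots, \tilde\alpha_p, \beta_1, \ldots, \beta_q)$. The scaling of $\tilde\varepsilon_k$ will result in $\varepsilon_k = \tilde\varepsilon_k/d$, $d > 0$, and $\sigma_k = d\tilde\sigma_k$. Thus (1.1) and (1.2) hold with $\theta = (d^2\tilde\omega, d^2\tilde\alpha_1, \ldots, d^2\tilde\alpha_p, \beta_1, \ldots, \beta_q)$. We choose $d$ such that (1.16) holds. By Theorem 1.2 we have that

$n^{1/2}(\hat\theta_n - \theta) \xrightarrow{D} N(0, 4\tau^2A^{-1}).$

The definitions of $c_i(u)$, $0 \le i < \infty$, yield that

$\frac{w_k'(\theta)}{w_k(\theta)} = \frac{w_k'(\tilde\theta)}{w_k(\tilde\theta)}\,M,$

where $M = \{M(i, j), 0 \le i, j \le p + q\}$, $M(i, j) = 0$ if $i \ne j$, $M(i, i) = 1/d^2$ if $0 \le i \le p$ and $M(i, i) = 1$ if $p < i \le p + q$. Hence

$A = M\,E\left\{\left(\frac{w_0'(\tilde\theta)}{w_0(\tilde\theta)}\right)^T\left(\frac{w_0'(\tilde\theta)}{w_0(\tilde\theta)}\right)\right\}M$

and therefore

(2.8)  $n^{1/2}\left((\hat\theta_{0,n}/d^2, \hat\theta_{1,n}/d^2, \ldots, \hat\theta_{p,n}/d^2, \hat\theta_{p+1,n}, \ldots, \hat\theta_{p+q,n}) - \tilde\theta\right) \xrightarrow{D} N\left(0, 4\tau^2\left[E\left\{\left(\frac{w_0'(\tilde\theta)}{w_0(\tilde\theta)}\right)^T\left(\frac{w_0'(\tilde\theta)}{w_0(\tilde\theta)}\right)\right\}\right]^{-1}\right),$

where $\hat\theta_{i,n}$, $0 \le i \le p + q$, denote the coordinates of $\hat\theta_n$. The limit result in (2.8) means that the only term which depends on $h$ in the limit is $\tau = \tau(\tilde\varepsilon/d)$. So the efficiency of the estimator is determined by $\tau$ only.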

Let us assume that the innovations $\tilde\varepsilon_k$ in (2.6) and (2.7) are standard normal random variables. Using the quasi-maximum likelihood method of Example 2.1 (which is the likelihood method of Example 2.4 in this case), we get that $\tau^2_{\mathrm{quasi}} = 1/2$. If we use the method of Example 2.2, we must rescale since it is assumed that the expected value of the absolute value of the innovations is 1, so the standard normal innovation must be divided by $(2/\pi)^{1/2}$. Hence $\tau^2_{\mathrm{exp}} = \pi/2 - 1$. Clearly, $\tau^2_{\mathrm{quasi}} < \tau^2_{\mathrm{exp}}$.

Now we assume that the innovations $\tilde\varepsilon_k$ are two-sided exponential random variables. In this case the methods of Examples 2.2 and 2.4 are the same and $\tau^2_{\mathrm{exp}} = 1$. If we use the method of Example 2.1, we need that the second moment is 1, so the innovations must be divided by $\sqrt{2}$. Hence $\tau^2_{\mathrm{quasi}} = 5/4$ in Example 2.1. This means that the variance of the estimators for $\beta_1, \ldots, \beta_q$ ($\beta_1, \ldots, \beta_q$ are invariant for rescaling the innovations) will be 25% more if the quasi-maximum likelihood method is used instead of the likelihood method.
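The four values of $\tau^2$ quoted above follow from the formulas of Examples 2.1 and 2.2 once the innovations are rescaled. A small sketch of that arithmetic, with illustrative function names and the absolute moments of the two distributions entered by hand:

```python
import math


def tau2_quasi(m2, m4):
    """tau^2 of Example 2.1 after rescaling to unit second moment.

    m2 and m4 are the second and fourth moments of the unscaled innovation,
    so the rescaled fourth moment is m4 / m2**2.
    """
    return (m4 / m2 ** 2 - 1.0) / 4.0


def tau2_exp(m1, m2):
    """tau^2 of Example 2.2 after rescaling to unit first absolute moment."""
    return m2 / m1 ** 2 - 1.0


# Standard normal: E|e| = sqrt(2/pi), E e^2 = 1, E e^4 = 3.
normal_quasi = tau2_quasi(1.0, 3.0)
normal_exp = tau2_exp(math.sqrt(2.0 / math.pi), 1.0)

# Two-sided exponential with density exp(-|t|)/2: E|e| = 1, E e^2 = 2, E e^4 = 24.
laplace_quasi = tau2_quasi(2.0, 24.0)
laplace_exp = tau2_exp(1.0, 2.0)
```

The ratio `laplace_quasi / laplace_exp` equals 1.25, which is the 25% increase in variance mentioned in the text.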

Let $\tilde\varepsilon_i$ be independent, identically distributed random variables with density function $f(t) = \{(\vartheta - 1)/2\}(1 + |t|)^{-\vartheta}$, where $\vartheta > 5$. Elementary calculations show that $E|\tilde\varepsilon_i| = 1/(\vartheta - 2)$, $E|\tilde\varepsilon_i|^2 = 2/((\vartheta - 2)(\vartheta - 3))$ and $E|\tilde\varepsilon_i|^4 = 24/((\vartheta - 2)(\vartheta - 3)(\vartheta - 4)(\vartheta - 5))$. If the quasi-maximum likelihood method is used, we use $\varepsilon_i = \tilde\varepsilon_i/(E|\tilde\varepsilon_i|^2)^{1/2}$ and therefore

$\tau^2_{\mathrm{quasi}} = \frac{1}{4}\left\{\frac{6(\vartheta - 2)(\vartheta - 3)}{(\vartheta - 4)(\vartheta - 5)} - 1\right\}.$

If we use the method of Example 2.2, that is, the two-sided exponential density in the definition of $\hat L_n(u)$, we use the innovations $\varepsilon_i = \tilde\varepsilon_i/E|\tilde\varepsilon_i|$ and we get

$\tau^2_{\mathrm{exp}} = \frac{2(\vartheta - 2)}{\vartheta - 3} - 1.$

Elementary calculations show that $\tau^2_{\mathrm{quasi}} > \tau^2_{\mathrm{exp}}$ for any $\vartheta > 5$. If $\vartheta = 6$, then $\tau^2_{\mathrm{quasi}} = 8.75$ while $\tau^2_{\mathrm{exp}} \approx 1.67$. If $\vartheta$ is large then $\tau^2_{\mathrm{quasi}} \approx 1.25$ while $\tau^2_{\mathrm{exp}} \approx 1$. The parameters $\beta_1, \ldots, \beta_q$ are invariant for scaling, so the two-sided exponential method gives smaller asymptotic variance than the quasi-maximum likelihood method. We note that the likelihood method of Example 2.4 provides the smallest possible variance for the estimation of $\beta_1, \ldots, \beta_q$. However, this example illustrates that if the density is unknown and we suspect that the tail of the distribution of the innovations is polynomial, the two-sided exponential method performs better than the quasi-maximum likelihood.
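The comparison for polynomial tails can be reproduced with exact rational arithmetic. The sketch below uses the absolute moments $E|\tilde\varepsilon|^k = k!/((\vartheta - 2)(\vartheta - 3)\cdots(\vartheta - 1 - k))$ of the density above; the function names are illustrative and $\vartheta$ is assumed to be an integer so that `Fraction` stays exact.

```python
from fractions import Fraction


def abs_moment(k, theta):
    """E|eps|^k for f(t) = ((theta-1)/2)(1+|t|)^(-theta), integer theta > k + 1."""
    value = Fraction(1)
    for j in range(1, k + 1):
        value *= Fraction(j, theta - 1 - j)
    return value


def tau2_quasi(theta):
    """tau^2 of Example 2.1 after rescaling to unit second moment."""
    m2, m4 = abs_moment(2, theta), abs_moment(4, theta)
    return (m4 / m2 ** 2 - 1) / 4


def tau2_exp(theta):
    """tau^2 of Example 2.2 after rescaling to unit first absolute moment."""
    m1, m2 = abs_moment(1, theta), abs_moment(2, theta)
    return m2 / m1 ** 2 - 1
```

With $\vartheta = 6$ this gives `tau2_quasi(6) == Fraction(35, 4)` and `tau2_exp(6) == Fraction(5, 3)`, matching the values 8.75 and about 1.67 in the text.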

If we are interested in the estimation of $d$ in the examples above, we can use the residuals. The residuals are defined as $\hat\varepsilon_i = y_i/\hat w_i^{1/2}(\hat\theta_n)$, $1 \le i \le n$. Let us assume that the estimation is done under the scaling assumption (1.16). Then $\hat d_n = \{\sum_{1 \le i \le n}\hat\varepsilon_i^2/(n - 1)\}^{1/2}$ can be used when we move to a model with scaling assumption $E\varepsilon_0^2 = 1$. However, replacing $d$ with $\hat d_n$ in (2.8) will change the asymptotic variance. Using different $h$'s in $\hat L_n(u)$, we study models based on different scaling assumptions. Since the parameters in (1.1) and (1.2) are not uniquely defined, scaling assumptions or reparametrizations [cf. Drost and Klaassen (1997) and Newey and Steigerwald (1997)] are required.
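The residual-based scale estimate described above amounts to two short steps. The sketch below assumes that the fitted values $\hat w_i(\hat\theta_n)$ have already been computed and are passed in as a list; the names `residuals` and `d_hat` are illustrative.

```python
import math


def residuals(y, w_fitted):
    """Residuals y_i / w_i^{1/2}, with w_fitted[i] the fitted conditional scale."""
    return [yi / math.sqrt(wi) for yi, wi in zip(y, w_fitted)]


def d_hat(res):
    """Scale estimate {sum of squared residuals / (n - 1)}^{1/2}."""
    n = len(res)
    return math.sqrt(sum(r * r for r in res) / (n - 1))
```

For instance, `d_hat(residuals([2.0, -2.0, 2.0], [4.0, 4.0, 4.0]))` evaluates the formula on three residuals of absolute value 1.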

3. Preliminary results. The first six lemmas are from Berkes, Horvath and Kokoszka (2003).

Lemma 3.1. If the conditions of Theorem 1.1 are satisfied and $\sigma_0^2 = w_0(u^*)$ with some $u^* \in U$, then $u^* = \theta$.

Proof. This result is part of the proof of Lemma 5.5 in Berkes, Horvath and Kokoszka (2003). □

Let $\log^+ x = \log x$ if $x > 1$ and 0 otherwise.

Lemma 3.2. Let $\varphi_0, \varphi_1, \varphi_2, \ldots$ be identically distributed random variables satisfying $E\log^+|\varphi_0| < \infty$. Then $\sum_{1 \le k < \infty}\varphi_k z^k$ converges a.s. for all $|z| < 1$.

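Lemma 3.3. If (1.5), (1.6) and (1.12) hold, then there is $\delta > 0$ such that

(3.1)  $E|y_0^2|^\delta < \infty$ and $E|\sigma_0^2|^\delta < \infty.$

Lemma 3.4. If (1.5), (1.6) and (1.8) hold, then there are constants $0 < C^*, C^{**} < \infty$ and $0 < \rho < 1$ such that

(3.2)  $C^* \le w_k(u) \le C^{**}\left(1 + \sum_{1 \le j < \infty}\rho^j y_{k-j}^2\right), \quad u \in U,$

and

$C^* \le \hat w_k(u) \le C^{**}\left(1 + \sum_{1 \le j < \infty}\rho^j y_{k-j}^2\right), \quad u \in U,$

for any $-\infty < k < \infty$.

Lemma 3.5. If (1.2), (1.5), (1.6), (1.8), (1.11) and (1.12) hold, then, for any $0 < \kappa^* < \kappa$,

$E\left(\sup_{u \in U}\frac{\sigma_k^2}{w_k(u)}\right)^{\kappa^*} < \infty.$

Lemma 3.6. If (1.5), (1.6), (1.8), (1.11) and (1.12) hold, then

$E\left(\sup_{u \in U}\left|\frac{w_k'(u)}{w_k(u)}\right|\right)^{\kappa^*} < \infty,$

$E\left(\sup_{u \in U}\left|\frac{w_k''(u)}{w_k(u)}\right|\right)^{\kappa^*} < \infty$

and

$E\left(\sup_{u \in U}\left|\frac{w_k'''(u)}{w_k(u)}\right|\right)^{\kappa^*} < \infty$

for any $\kappa^* > 0$.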

For any $u = (x, s_1, \ldots, s_p, t_1, \ldots, t_q) \in U$ and $\gamma \ge 1$, we define

(3.3)  $U(\gamma, u) = \left\{u^* = (x^*, s_1^*, \ldots, s_p^*, t_1^*, \ldots, t_q^*) \in U : \max_{1 \le j \le q} t_j^*/t_j \le \gamma\right\}.$

Lemma 3.7. If (1.5), (1.6), (1.8), (1.11) and (1.12) hold, then for any $-\infty < \kappa^* < \infty$ there is $\gamma > 1$ such that

$E\left(\sup_{u^* \in U(\gamma', u)}\frac{w_k(u^*)}{w_k(u)}\right)^{\kappa^*} < \infty$

for all $u \in U$ and $1 < \gamma' < \gamma$.

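Proof. Due to symmetry we can assume that $\kappa^* > 0$. We note that $\underline{u} \le c_0(u) \le \overline{u}/(1 - \rho_0)$ for all $u \in U$ and $0 \le c_i(u^*) \le K_1\gamma^i c_i(u)$ with some constant $K_1$ for all $u^* \in U(\gamma, u)$ by Lemma 3.1 of Berkes, Horvath and Kokoszka (2003). Thus Lemma 3.7 will be proven if we show that

$E\left(\sup_{u \in U}\frac{\sum_{1 \le i < \infty}\gamma^i c_i(u)y_{k-i}^2}{1 + \sum_{1 \le i < \infty}c_i(u)y_{k-i}^2}\right)^{\kappa^*} < \infty.$

By Lemma 3.1 in Berkes, Horvath and Kokoszka (2003) there are constants $c$ and $0 < \rho < 1$ such that

(3.4)  $|c_k(u)| \le c\rho^k$ for all $u \in U$ and $k$.

For any $M \ge 1$ we have

$\frac{\sum_{1 \le i < \infty}\gamma^i c_i(u)y_{k-i}^2}{1 + \sum_{1 \le i < \infty}c_i(u)y_{k-i}^2} \le \gamma^M + \sum_{M < i < \infty}\gamma^i c_i(u)y_{k-i}^2 \le \gamma^M + K_3\sum_{M < i < \infty}(\gamma\rho)^i y_{k-i}^2,$

with some $K_3$ on account of (3.4). By the Markov inequality we have

$P\left\{\sum_{M < i < \infty}(\gamma\rho)^i y_{k-i}^2 > t/2\right\} \le \sum_{M < i < \infty}P\left\{y_{k-i}^2 > (t/2)(\gamma\rho)^{-i}(1 - (\gamma\rho)^{1/2})(\gamma\rho)^{i/2}\right\}$

$\le E|y_0^2|^\delta(1 - (\gamma\rho)^{1/2})^{-\delta}(1 - (\gamma\rho)^{\delta/2})^{-1}(t/2)^{-\delta}(\gamma\rho)^{\delta M/2}.$

Choosing $M = \log(t/2)/\log\gamma$, $t \ge 2\gamma$, we have, for any $\kappa^* > 0$,

$P\left\{\sup_{u \in U}\frac{\sum_{1 \le i < \infty}\gamma^i c_i(u)y_{k-i}^2}{1 + \sum_{1 \le i < \infty}c_i(u)y_{k-i}^2} > t\right\} \le P\left\{K_3\sum_{M < i < \infty}(\gamma\rho)^i y_{k-i}^2 > t/2\right\}$

$\le K_4\exp\left(-(\delta/2)\left(1 + \log\rho^{-1}/\log\gamma\right)\log(t/2)\right),$

where $K_4$ is a constant. If $\gamma > 1$ is close enough to 1, the exponent $(\delta/2)(1 + \log\rho^{-1}/\log\gamma)$ exceeds $\kappa^*$, so the moment of order $\kappa^*$ is finite. This completes the proof of Lemma 3.7. □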

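4. Proofs.

Lemma 4.1. If (1.5), (1.6) and (1.8)-(1.13) hold, then $L(u)$ is defined for any $u \in U$.

Proof. By Lemma 3.3 there is $0 < \delta < 1$ such that $E|y_0^2|^\delta < \infty$ and therefore

(4.1)  $E\left(1 + \sum_{1 \le j < \infty}\rho^j y_{k-j}^2\right)^\delta \le 1 + E|y_0^2|^\delta\sum_{1 \le j < \infty}(\rho^\delta)^j < \infty$

for all $0 < \rho < 1$. Therefore by Lemma 3.4 we have

(4.2)  $E\sup_{u \in U}|\log w_0(u)| < \infty.$

Since $\varepsilon_0$ and $\sigma_0/w_0^{1/2}(u)$ are independent, by (1.13) we obtain

$E\left|\log h\left(\varepsilon_0\frac{\sigma_0}{w_0^{1/2}(u)}\right)\right| \le C_0\left(1 + E\left(\frac{\sigma_0^2}{w_0(u)}\right)^{\nu_0/2}\right).$

Using Lemma 3.5 and (4.2), we conclude

(4.3)  $E\sup_{u \in U}\left|\log\left\{\frac{1}{w_0^{1/2}(u)}\,h\left(\frac{y_0}{w_0^{1/2}(u)}\right)\right\}\right| < \infty,$

and thus Lemma 4.1 is proved. □

Let

$L_n(u) = \frac{1}{n}\sum_{1 \le k \le n}\log\left\{\frac{1}{w_k^{1/2}(u)}\,h\left(y_k/w_k^{1/2}(u)\right)\right\}.$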
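Lemma 4.2. If (1.5), (1.6) and (1.8)-(1.14) hold, then

$E\left\{\sup\frac{1}{|u - v|}|L_n(u) - L_n(v)| : u \in U, v \in U\right\} < \infty.$

Proof. By the mean value theorem there is a random variable $\eta \in U$ such that

$|g(\varepsilon_k, \sigma_k/w_k^{1/2}(u)) - g(\varepsilon_k, \sigma_k/w_k^{1/2}(v))| = \frac{1}{2}\left|g_1(\varepsilon_k, \sigma_k/w_k^{1/2}(\eta))\,\frac{\sigma_k}{w_k^{1/2}(\eta)}\,\frac{w_k'(\eta)}{w_k(\eta)}\,(u - v)^T\right|.$

So by condition (1.14) we conclude

$\sup_{u,v}\frac{1}{|u - v|}\left|g\left(\varepsilon_k, \frac{\sigma_k}{w_k^{1/2}(u)}\right) - g\left(\varepsilon_k, \frac{\sigma_k}{w_k^{1/2}(v)}\right)\right|$

$\le \frac{1}{2}C_1(\varepsilon_k)\sup_{\eta \in U}\left\{\left(\left(\frac{\sigma_k}{w_k^{1/2}(\eta)}\right)^{\nu_1} + 1\right)\frac{w_k^{1/2}(\eta)}{\sigma_k}\,\frac{\sigma_k}{w_k^{1/2}(\eta)}\left|\frac{w_k'(\eta)}{w_k(\eta)}\right|\right\}$

$\le \frac{1}{2}C_1(\varepsilon_k)\left\{\sup_{u \in U}\left(\frac{\sigma_k^2}{w_k(u)}\right)^{\nu_1/2} + 1\right\}\sup_{v \in U}\left|\frac{w_k'(v)}{w_k(v)}\right|.$

We note that $\varepsilon_k$ and $\{\sigma_k^2/w_k(u), u \in U\}$ are independent for any $k$. The Holder inequality and Lemmas 3.5 and 3.6 yield

$E\left\{\left(\sup_{u \in U}\left(\frac{\sigma_k^2}{w_k(u)}\right)^{\nu_1/2} + 1\right)\sup_{v \in U}\left|\frac{w_k'(v)}{w_k(v)}\right|\right\} \le K_1\left\{E\left(\sup_{u \in U}\left(\frac{\sigma_k^2}{w_k(u)}\right)^{\nu_1/2} + 1\right)^{\gamma}\right\}^{1/\gamma}\left\{E\sup_{v \in U}\left|\frac{w_k'(v)}{w_k(v)}\right|^{\gamma'}\right\}^{1/\gamma'} < \infty$

with some constant $K_1$, where $1 < \gamma, \gamma' < \infty$ satisfy $(\nu_1/2)\gamma < \kappa$ and $1/\gamma + 1/\gamma' = 1$. Since $L_n(u)$ is the average of stationary random variables, the proof of Lemma 4.2 is now complete. □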

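Lemma 4.3. If (1.5), (1.6) and (1.8)-(1.14) hold, then

$n\sup_{u \in U}|\hat L_n(u) - L_n(u)| = O(1)$ a.s.

Proof. We use (3.4). Let

$\zeta = \sum_{0 \le i < \infty}\rho^i y_{-i}^2.$

We note that by Lemmas 3.2 and 3.3 the series defining $\zeta$ converges a.s. Using the definitions of $w_k(u)$, $\hat w_k(u)$ and (3.4), we conclude

(4.4)  $\sup_{u \in U}|w_k(u) - \hat w_k(u)| \le c\sum_{k \le i < \infty}\rho^i y_{k-i}^2 = c\rho^k\zeta.$

Using next the mean value theorem, there is $\eta$ between $\sigma_k/\hat w_k^{1/2}(u)$ and $\sigma_k/w_k^{1/2}(u)$ such that

$|g(\varepsilon_k, \sigma_k/\hat w_k^{1/2}(u)) - g(\varepsilon_k, \sigma_k/w_k^{1/2}(u))| = |g_1(\varepsilon_k, \eta)|\left|\frac{\sigma_k}{\hat w_k^{1/2}(u)} - \frac{\sigma_k}{w_k^{1/2}(u)}\right|.$

Applying condition (1.14) and Lemma 3.4, we get

$|g_1(\varepsilon_k, \eta)|\left|\frac{\sigma_k}{\hat w_k^{1/2}(u)} - \frac{\sigma_k}{w_k^{1/2}(u)}\right| \le C_1(\varepsilon_k)\frac{\eta^{\nu_1} + 1}{\eta}\,\sigma_k\frac{|w_k(u) - \hat w_k(u)|}{\hat w_k^{1/2}(u)w_k^{1/2}(u)(\hat w_k^{1/2}(u) + w_k^{1/2}(u))}$

$\le C_1(\varepsilon_k)\left\{\left(\frac{\sigma_k}{\hat w_k^{1/2}(u)}\right)^{\nu_1} + 1\right\}\frac{w_k^{1/2}(u)}{\sigma_k}\,\sigma_k\frac{|w_k(u) - \hat w_k(u)|}{\hat w_k^{1/2}(u)w_k^{1/2}(u)(\hat w_k^{1/2}(u) + w_k^{1/2}(u))}$

$\le C_1(\varepsilon_k)\left\{\left(\frac{\sigma_k}{(C^*)^{1/2}}\right)^{\nu_1} + 1\right\}\frac{c\rho^k\zeta}{2C^*}$

$\le K_2C_1(\varepsilon_k)(\sigma_k^{\nu_1} + 1)\zeta\rho^k$

for any $k$. Applying (4.2) with $u = \theta$, we see that $E|\log\sigma_0^2| < \infty$. We can assume without loss of generality that $C_1$ in (1.14) is larger than 1 and thus by (1.15) we conclude that $E|\log(C_1(\varepsilon_0)(\sigma_0^{\nu_1} + 1))| < \infty$. Thus we can apply Lemma 3.2 to get

$n\sup_{u \in U}|\hat L_n(u) - L_n(u)| \le K_2\zeta\sum_{1 \le k < \infty}C_1(\varepsilon_k)(\sigma_k^{\nu_1} + 1)\rho^k < \infty$ a.s. □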



Proof of Theorem 1.1. Since $\log\{w_k^{-1/2}(u)h(y_kw_k^{-1/2}(u))\}$ is a stationary sequence with finite mean $L(u)$ and by Theorem 3.5.8 of Stout (1974) it is also ergodic, the ergodic theorem implies that $L_n(u) \to L(u)$ a.s. for any fixed $u \in U$. Thus Lemma 4.2 yields

$\sup_{u \in U}|L_n(u) - L(u)| \to 0$ a.s.

Using now Lemma 4.3, we conclude that

(4.5)  $\sup_{u \in U}|\hat L_n(u) - L(u)| \to 0$ a.s.

We note that

(4.6)  $L(\theta) - L(u) = E\{g(\varepsilon_0, 1) - g(\varepsilon_0, \sigma_0/w_0^{1/2}(u))\}.$

Since $\varepsilon_0$ and $\{\sigma_0/w_0^{1/2}(u), u \in U\}$ are independent, by (1.16) we have that $L(\theta) \ge L(u)$ and we have that $L(\theta) = L(u)$ if and only if $\sigma_0 = w_0^{1/2}(u)$. Using Lemma 3.1, we get that $L(u)$ has a unique maximum at $\theta$. The function $L(u)$ is continuous, and thus the uniform a.s. convergence of $\hat L_n(u)$ to $L(u)$ implies $\hat\theta_n \to \theta$, proving Theorem 1.1.
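Since

$L_n(u) = \frac{1}{n}\sum_{1 \le k \le n}g(\varepsilon_k, \sigma_k/w_k^{1/2}(u)) - \frac{1}{n}\sum_{1 \le k \le n}\log\sigma_k,$

we get that

(4.7)  $L_n'(u) = \frac{1}{n}\sum_{1 \le k \le n}g_1(\varepsilon_k, \sigma_k/w_k^{1/2}(u))\left(-\frac{1}{2}\frac{\sigma_k}{w_k^{1/2}(u)}\frac{w_k'(u)}{w_k(u)}\right)$

and

(4.8)  $L_n''(u) = \frac{1}{n}\sum_{1 \le k \le n}g_2(\varepsilon_k, \sigma_k/w_k^{1/2}(u))\left(\frac{1}{2}\frac{\sigma_k}{w_k^{1/2}(u)}\frac{w_k'(u)}{w_k(u)}\right)^T\left(\frac{1}{2}\frac{\sigma_k}{w_k^{1/2}(u)}\frac{w_k'(u)}{w_k(u)}\right)$

$\qquad + \frac{1}{n}\sum_{1 \le k \le n}g_1(\varepsilon_k, \sigma_k/w_k^{1/2}(u))\left(\frac{3}{4}\frac{\sigma_k}{w_k^{1/2}(u)}\left(\frac{w_k'(u)}{w_k(u)}\right)^T\frac{w_k'(u)}{w_k(u)} - \frac{1}{2}\frac{\sigma_k}{w_k^{1/2}(u)}\frac{w_k''(u)}{w_k(u)}\right).$

Similarly,

(4.9)  $L'(u) = E\,g_1(\varepsilon_0, \sigma_0/w_0^{1/2}(u))\left(-\frac{1}{2}\frac{\sigma_0}{w_0^{1/2}(u)}\frac{w_0'(u)}{w_0(u)}\right)$

and

(4.10)  $L''(u) = E\left\{g_2(\varepsilon_0, \sigma_0/w_0^{1/2}(u))\left(\frac{1}{2}\frac{\sigma_0}{w_0^{1/2}(u)}\frac{w_0'(u)}{w_0(u)}\right)^T\left(\frac{1}{2}\frac{\sigma_0}{w_0^{1/2}(u)}\frac{w_0'(u)}{w_0(u)}\right)\right.$

$\qquad\left. + \,g_1(\varepsilon_0, \sigma_0/w_0^{1/2}(u))\left(\frac{3}{4}\frac{\sigma_0}{w_0^{1/2}(u)}\left(\frac{w_0'(u)}{w_0(u)}\right)^T\frac{w_0'(u)}{w_0(u)} - \frac{1}{2}\frac{\sigma_0}{w_0^{1/2}(u)}\frac{w_0''(u)}{w_0(u)}\right)\right\}.$

The expected value in (4.9) exists, since by (1.14) and (1.15) and the independence of $\varepsilon_0$ and $\sigma_0/w_0^{1/2}(u)$ we have

(4.11)  $E\left|g_1\left(\varepsilon_0, \frac{\sigma_0}{w_0^{1/2}(u)}\right)\frac{\sigma_0}{w_0^{1/2}(u)}\frac{w_0'(u)}{w_0(u)}\right| \le E\left\{C_1(\varepsilon_0)\left(\left(\frac{\sigma_0}{w_0^{1/2}(u)}\right)^{\nu_1} + 1\right)\left|\frac{w_0'(u)}{w_0(u)}\right|\right\}$

$\qquad = E\,C_1(\varepsilon_0)\,E\left\{\left(\left(\frac{\sigma_0}{w_0^{1/2}(u)}\right)^{\nu_1} + 1\right)\left|\frac{w_0'(u)}{w_0(u)}\right|\right\} < \infty$

on account of the Holder inequality and Lemmas 3.5 and 3.6. A similar argument shows that the expected value in (4.10) also exists for all $u \in U$.

□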

Lemma 4.4. If (1.5), (1.6), (1.8)-(1.14), (1.17) and (1.18) hold, then there exists $U^*$, a neighborhood of $\theta$, such that

(4.12)  $\sup_{u \in U^*}|L_n'(u) - L'(u)| \to 0$ a.s.

If, in addition, (1.19) and (1.20) are satisfied, then

(4.13)  $\sup_{u \in U^*}|L_n''(u) - L''(u)| \to 0$ a.s.

Also, $E\{(w_0'(u)/w_0(u))^T(w_0'(u)/w_0(u))\}$ is a nonsingular matrix for any $u \in U^*$.

Proof. Let $U^* = U(\gamma, \theta)$ with some $\gamma$ to be chosen later. Applying (1.17), we obtain

/ CTk 

V V (u) 



(4.14) 



52 £k 



<C2{ek) 



(^k 



1/2/ N 

wj U 






'^^ wf^{u) 



CTk 



1/2/ \ 

w' (u) 



U2 



+ 1 



0-fc 



1/2/ ^ 

Wk (u) 



<.(u) 



Wk{u.) 



Using (4.7), (4.14) and conditions (1.14) and (1.17), we get 



\d-u 



■n\L'^{e) - L'^{u) 



< 



\0-u\ 



+ 



E 

l<fc<n 



^E 



9i[ek,—[j^ 



<' (0) 



(^k 



e-u 



l<k<n 



51 '^k,- 



<^k 



< ^ C2[£k){ sup 



l<fc<n 



1/2/ 



0-fc 




w]^^{e)Wk{0) w]j'^{vL)Wk{^) 



1/2/ 



vef/*VVu;,/^(v) 



+1 



' V / /u 



+ ^ Ci(efc)( sup 



l<fc<n 



vec/AVu;i/'(v) 



+1 



V (v) 
,1/2 



sup 

6(7 



<(u) 



u^fc(u) 



w^fc (v) 
× sup



0"fe 



sup 



zGU* WjJ (z) uGC/ \2 

E (4,1 + 4,2). 



<i^) 



Wk{u) 



<(u) 



tt;fc(u) 



l<fc<n 



Using the Cauchy inequality and the independence of $\varepsilon_k$ and $\{w_k(u), u \in U\}$, we conclude from Lemmas 3.7 and 3.6 that



EIk,i = EC2iek)E[ sup 

vvGf/* 



<EC2{ek){E[ sup 



o-fc 



1/2, 



+ 1 



o-fc 



-1 



CTfc 



X ( £; sup 






^2 



+ 1 



1/2^ \ 
V (v) 

1/2/ \ 



sup 

ue(7 



^«^. u 



1\ 2\ 1/2 



< oo, 



u;fc(u) 

provided $\gamma > 1$ is chosen close enough to 1. Similarly,

$E I_{k,2} < \infty.$

Thus $I_{k,1}$ and $I_{k,2}$ are stationary sequences with finite expectations and by Theorem 3.4.8 of Stout (1974) they are also ergodic. Hence our previous estimates and the ergodic theorem imply

(4.15)  $\sup_{u \in U^*}\frac{1}{|\theta - u|}|L_n'(\theta) - L_n'(u)| = O(1)$ a.s.

Since $L_n'(u)$ is an average of a stationary, ergodic sequence with finite expectation [cf. (4.11)], another application of the ergodic theorem gives, for any $u \in U^*$,

(4.16)  $L_n'(u) \to L'(u)$ a.s.

Putting together (4.15) and (4.16), we get (4.12). Similar arguments yield (4.13).

Berkes, Horvath and Kokoszka (2003) proved that $E\{(w_0'(u)/w_0(u))^T(w_0'(u)/w_0(u))\}$ is a continuous function and it is nonsingular at $u = \theta$, so the proof of Lemma 4.4 is complete. □

Let $\theta_n = \arg\max\{L_n(u) : u \in U\}$.

Lemma 4.5. If (1.5), (1.6) and (1.8)-(1.22) are satisfied, then

(4.17)  $n^{1/2}(\theta_n - \theta) = \frac{1}{n^{1/2}}\sum_{1 \le k \le n}g_1(\varepsilon_k, 1)\frac{2}{E g_2(\varepsilon_0, 1)}\frac{w_k'(\theta)}{w_k(\theta)}A^{-1}(1 + o(1))$ a.s.

and

(4.18)  $n^{1/2}(\theta_n - \theta) = \frac{1}{n^{1/2}}\sum_{1 \le k \le n}g_1(\varepsilon_k, 1)\frac{2}{E g_2(\varepsilon_0, 1)}\frac{w_k'(\theta)}{w_k(\theta)}A^{-1} + o_P(1).$

Proof. We showed [cf. (4.9) and (4.11)] that $L(u)$ is differentiable for all $u \in U$ and proved after (4.6) that $L(u)$ has a unique maximum at $u = \theta$. Thus $L'(\theta) = 0$. By the independence of $\varepsilon_0$ and $w_0'(\theta)/w_0(\theta)$ and (4.9), we have

$0 = L'(\theta) = -\frac{1}{2}E g_1(\varepsilon_0, 1)\,E\frac{w_0'(\theta)}{w_0(\theta)}.$

Berkes, Horvath and Kokoszka (2003) showed $E w_0'(\theta)/w_0(\theta) \ne 0$, so we have

(4.19)  $E g_1(\varepsilon_0, 1) = 0.$

By (4.5) it follows easily that $\theta_n \to \theta$ a.s. Hence there is a random variable $n_0$ such that $\theta_n \in U^*$ if $n \ge n_0$, where $U^*$ is defined in Lemma 4.4. Clearly, $U^* \subset U$ is compact and for $\gamma > 1$ sufficiently close to 1 it does not have common points with the boundary of $U$. Since $L_n(u)$ is twice differentiable and it reaches a maximum at $\theta_n$, we have

(4.20)  $L_n'(\theta_n) = 0$ if $n \ge n_0$,

and thus

$L_n'(\theta_n) - L_n'(\theta) = -L_n'(\theta).$

By (4.13) we have that $L_n'(\theta_n) - L_n'(\theta) = (\theta_n - \theta)(L''(\theta) + o(1))$ a.s. Observing that $L''(\theta) = \frac{1}{4}E g_2(\varepsilon_0, 1)A$, and using (4.7), the proof of (4.17) is complete. By the orthogonality and stationarity of the summands in (4.17), and in view of (1.21), the variance of the sum is $O(1/n)$ and therefore (4.18) follows from (4.17). □

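A simple calculation shows, in analogy with (4.7),

(4.21)  $\hat L_n'(u) = \frac{1}{n}\sum_{1 \le k \le n}g_1(\varepsilon_k, \sigma_k/\hat w_k^{1/2}(u))\left(-\frac{1}{2}\frac{\sigma_k}{\hat w_k^{1/2}(u)}\frac{\hat w_k'(u)}{\hat w_k(u)}\right).$

Lemma 4.6. If (1.5), (1.6) and (1.8)-(1.22) are satisfied, then

(4.22)  $\sup_{u \in U}|\hat L_n'(u) - L_n'(u)| = O\left(\frac{1}{n}\right)$ a.s.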
Proof. Berkes, Horvath and Kokoszka (2003) showed that there are constants $c$ and $0 < \rho_* < 1$ such that

$|c_i'(u)| \le c\rho_*^i$ and $|c_i''(u)| \le c\rho_*^i$

for all $u \in U$ and $0 \le i < \infty$. Hence

(4.23)  $\sup_{u \in U}|w_k'(u) - \hat w_k'(u)| \le c\sum_{k \le i < \infty}\rho_*^i y_{k-i}^2 = c\rho_*^k\zeta_*,$

where $\zeta_* = \sum_{0 \le i < \infty}\rho_*^i y_{-i}^2$ converges a.s. by Lemmas 3.2 and 3.3. By (1.14), (1.17), (4.7), (4.21) and $\hat w_k(u) \le w_k(u)$, we have


n|L^(u) - Ln{u) 



< E 

l<fc<n. 



51 ^fcr 



o"fc 



+ E 

l<fc<n 



91 ^k, 



.1/2/ N 



51 ^fc 



Cfk 



(Tk W^{M) 



u;^/^(u)'»^fc(u) 



.1/2. X 



1/2/ \ 

0-fc u;;,(u) cjfc w'^{u 



w 



< E ^2(^.) 
1<A:<71 



0"fc 



n1/2/ 



Wk (u) 



+ 1 



V2(u)u;fe(u) 



w 



1/2 



(u) Wkiu) 



1/2. ^ -1/2/ N 



Wk (u) 
0-fc u;[,(u) 



u;^/^(u)w^fc(u) 



+ E Ciisk) 

l<k<n 



0"fc 



X O-fc 



.1/2/ s 



0-fc 



.1/2/ X 
^k (u) 



.3/2/ s 3/2/ s 

V (u) V (u) 

= -'n,l(u) + J„,2(u). 

By Lemma 3.4 and (4.4), using $\zeta$ and $\rho$ in the proof of Lemma 4.3, we get



Jn,l(u)< J2 C2{ek) 



l<k<n 



0"fe 



.1/2/ ^ 



+ 1 



0"fc 



1/2/ \ 

Wk (u) 



-1 



w'kH 



Wk{u) 



0-fc 



o^fc 



1/2 



1/2/ 



V (") ^fc (") 



< E ^2(£fc) 



l<fe<n 



0-fc 



.1/2/ ^ 

wj U 



+ 1 



<(u) 



it;fc(,uj 



i/;fc(u) -i(;a:(u) 



•1/2/ 



,1/2 



1/2/ 



V (u)(V (u)+V (u)) 
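$\le K_3\sum_{1 \le k \le n}C_2(\varepsilon_k)(\sigma_k^{\nu_2} + 1)\sup_{u \in U}\left|\frac{w_k'(u)}{w_k(u)}\right|\sup_{u \in U}|\hat w_k(u) - w_k(u)|$

$\le K_3\zeta\sum_{1 \le k < \infty}C_2(\varepsilon_k)(\sigma_k^{\nu_2} + 1)\sup_{u \in U}\left|\frac{w_k'(u)}{w_k(u)}\right|\rho^k < \infty$ a.s.

In the last step we also used Lemma 3.2 and the observation that $C_2(\varepsilon_k)(\sigma_k^{\nu_2} + 1)\sup_{u \in U}|w_k'(u)/w_k(u)|$ is a stationary sequence with

$E\left|\log\left\{C_2(\varepsilon_k)(\sigma_k^{\nu_2} + 1)\sup_{u \in U}\left|\frac{w_k'(u)}{w_k(u)}\right|\right\}\right| < \infty.$

Hence $\sup_{u \in U}J_{n,1}(u) = O(1)$ a.s. Replacing (4.4) with (4.23), similar arguments show that $\sup_{u \in U}J_{n,2}(u) = O(1)$ a.s., completing the proof of (4.22). □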

Lemma 4.7. If (1.5), (1.6) and (1.8)-(1.22) are satisfied, then

(4.24)  $|\hat\theta_n - \theta_n| = O\left(\frac{1}{n}\right)$ a.s.

Proof. Similarly to the proof of Lemma 4.5 there is a random variable $n_0$ such that

(4.25)  $\hat L_n'(\hat\theta_n) = 0$ and $\hat\theta_n \in U^*$ if $n \ge n_0$,

where the set $U^*$ is defined in the proof of Lemma 4.5. By (4.13) we have

$L_n'(\hat\theta_n) - L_n'(\theta_n) = (\hat\theta_n - \theta_n)L''(\theta)(1 + o(1))$ a.s.

and therefore

$(\hat\theta_n - \theta_n) = (L_n'(\hat\theta_n) - L_n'(\theta_n))(L''(\theta))^{-1}(1 + o(1))$ a.s.

We recall that $L_n'(\theta_n) = 0$. Lemma 4.6 and (4.25) yield that $L_n'(\hat\theta_n) = \hat L_n'(\hat\theta_n) + O(1/n) = O(1/n)$ a.s., completing the proof of Lemma 4.7. □

Proof of Theorem 1.2. By Lemma 4.7, relation (4.18) remains valid if we replace $\theta_n$ by $\hat\theta_n$. Observe now that $g_1(\varepsilon_k, 1)w_k'(\theta)/w_k(\theta)$ is a stationary martingale difference sequence with respect to the $\sigma$-algebra generated by $\{\varepsilon_j, j \le k\}$. By Theorem 3.4.8 of Stout (1974) it is also ergodic. Using the Cramer-Wold device [cf. Billingsley (1968), page 49] and Theorem 23.1 of Billingsley [(1968), page 206], we obtain the multivariate central limit theorem expressed by Theorem 1.2. □